The effect of SARS-CoV-2 variant on respiratory features and mortality

SARS-CoV-2 (COVID-19) has caused over 80 million infections and 973,000 deaths in the United States, and mutations are linked to increased transmissibility. This study aimed to determine the effect of SARS-CoV-2 variants on respiratory features and mortality, and to determine the effect of vaccination status. A retrospective review of medical records (n = 55,406 unique patients) using the University of California Health COvid Research Data Set (UC CORDS) was performed to identify respiratory features, vaccination status, and mortality from 01/01/2020 to 04/26/2022. Variants were identified using the CDC data tracker. Increased odds of death were observed amongst unvaccinated individuals compared with fully vaccinated individuals, partially vaccinated individuals, or individuals who received any vaccination during multiple waves of the pandemic. Vaccination status was associated with survival and a decreased frequency of many respiratory features. More recent SARS-CoV-2 variants show a reduction in lower respiratory tract features with an increase in upper respiratory tract features. Being fully vaccinated was associated with fewer respiratory features and higher odds of survival, supporting vaccination in preventing morbidity and mortality from COVID-19.

Variant and vaccination status. The UC CORDS data set did not report variant type; therefore, variants were identified based upon dates when each variant was dominant as reported by the CDC data tracker 2. Although the CDC data tracker contains national-level data, the daily trends in the number of COVID-19 cases reported by the California Department of Public Health did not differ enough to warrant focusing on California-specific data, as shown in the Supplementary Information Figs. 1 and 2 2,14. Moreover, this method was previously used by Wang et al. 3 to classify COVID-19 patients in their analysis of outcomes for the US population; although it lacks the precision of laboratory-confirmed variant identification, it is nonetheless a measure that reflects the dominant variant waves that infected the US population at successive timepoints throughout the pandemic 3. Accordingly, the date ranges extended from 01/01/2020 to 06/30/2020 for the Founder variant, 06/30/2020 to 05/31/2021 for the Alpha variant, 06/01/2021 to 11/30/2021 for the Delta variant, and from 12/01/2021 to 04/26/2022 for the Omicron variant. Although vaccines were not available for COVID-19 infection until December 2020, SARS-CoV-2 variants that circulated before the vaccine became available were included in this study primarily to depict the evolutionary changes in symptom presentation over time with each variant, rather than to focus on a comparison of fully vaccinated, partially vaccinated, or unvaccinated status. Patients who received at least two doses of the vaccine before their positive test result were considered fully vaccinated. Patients who received one dose of the vaccine before their positive test result were considered partially vaccinated, and those who received no vaccine were considered unvaccinated.
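The date-based classification described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the study's code: the function names are hypothetical, the date ranges are treated as inclusive, and because 06/30/2020 appears in both the Founder and Alpha ranges as stated, the sketch assumes the first matching range wins.

```python
from datetime import date

# Dominant-variant date ranges as stated in the text (treated as inclusive).
# 06/30/2020 is listed in both the Founder and Alpha ranges; this sketch
# assumes the first matching range wins, which assigns that day to Founder.
VARIANT_WINDOWS = [
    ("Founder", date(2020, 1, 1), date(2020, 6, 30)),
    ("Alpha", date(2020, 6, 30), date(2021, 5, 31)),
    ("Delta", date(2021, 6, 1), date(2021, 11, 30)),
    ("Omicron", date(2021, 12, 1), date(2022, 4, 26)),
]


def classify_variant(test_date):
    """Return the presumed dominant variant for a positive-test date, or None."""
    for name, start, end in VARIANT_WINDOWS:
        if start <= test_date <= end:
            return name
    return None  # outside the study period


def vaccination_status(doses_before_positive_test):
    """Map the number of doses received before the positive test to a status."""
    if doses_before_positive_test >= 2:
        return "fully vaccinated"
    if doses_before_positive_test == 1:
        return "partially vaccinated"
    return "unvaccinated"
```

For example, `classify_variant(date(2021, 7, 4))` falls in the Delta range, and a patient with one dose before the positive test is labeled partially vaccinated.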
Inclusion and exclusion criteria and sample. The study involved review of electronic health record data in the UC CORDS data set. Inclusion criteria for this study included all patients, regardless of age, who had a positive test anywhere in the hospital setting (i.e. emergency department, intensive care unit, or any other hospital unit). For any given positive test, respiratory features data was included in a window of 5 days prior to the test result and up to 30 days after a positive RT-PCR test for SARS-CoV-2. Exclusion criteria for the study included those whose data was obtained from non-hospital (i.e. clinic-based) outpatient settings and those who had a positive RT-PCR SARS-CoV-2 test outside of the predetermined window.
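The inclusion window for respiratory features can be expressed as a simple date check. The sketch below is an assumption-laden illustration: the function name is hypothetical, and both window boundaries are assumed to be inclusive, which the text does not specify.

```python
from datetime import date, timedelta

DAYS_BEFORE = 5   # days before the positive test, per the text
DAYS_AFTER = 30   # days after the positive test, per the text


def in_feature_window(feature_date, positive_test_date):
    """Return True if a recorded feature falls inside the inclusion window.

    Boundaries are assumed inclusive; the source does not state this.
    """
    start = positive_test_date - timedelta(days=DAYS_BEFORE)
    end = positive_test_date + timedelta(days=DAYS_AFTER)
    return start <= feature_date <= end
```

With a positive test on 1 March 2021, a feature recorded on 24 February 2021 would be kept, while one recorded on 1 April 2021 would be excluded.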
Demographic data were obtained by searching the data set for each demographic variable of interest, including age, gender, and race/ethnicity data. Race and ethnicity data were included to ensure the data set included a representative sample of the general population in California. Comorbidity data were obtained by searching the data set for the 200 most common ICD-10 codes listed for the patients and then filtered further by combining duplicate terms (e.g. "chronic obstructive pulmonary disease (COPD)" and "COPD") and removing terms that were not pertinent to this study (e.g. pregnancy).
Respiratory feature identification and extraction. The 40 most reported features across all body systems were extracted for each variant through a query in the UC CORDS data set. Forty features were chosen as an initial starting point for assessing the number of features based on historical work that found features may range from as few as 17 15 to as many as 50 features 16. In this study, the total number of features per variant ranged from 27 features during the Founder wave to 34 features during the Delta wave. Therefore, there were no remaining features undiscovered from the search results.
www.nature.com/scientificreports/
The preliminary search to identify prominent respiratory features involved running a query for the top 40 most frequent ICD-10 codes amongst all the patients in the data set. The term "features" was substituted for ICD-10 codes to account for the differing nature of the results; for example, some ICD-10 codes are medical diagnoses (e.g. acute respiratory failure or pneumonia), while others may be symptoms that a patient reports (e.g. cough or nasal congestion), and others may be signs that a medical provider assesses (e.g. dyspnea). The feature selection was then compared with historical work by others which demonstrated the most prevalent signs and symptoms (i.e. "features") that affected people with acute and post-acute SARS-CoV-2 infection 15,16 to assess face validity and consistency with prior work of the retrieved data. Of these most reported features across each variant, the non-respiratory features were classified as "unclassified" and the remaining respiratory features were assigned classification into lower and upper respiratory features through expert consultation and discussion amongst the research group. All features, their ranking, and their classifications are provided in a tabular format in Supplementary Information Tables 1, 2, 3 and 4. The reported frequency of each feature was then normalized per 100 cases.
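Normalizing a feature count per 100 cases is a single proportion scaled by 100. The sketch below illustrates the arithmetic; the function name is hypothetical and the counts are made up for illustration, not taken from UC CORDS.

```python
def per_100_cases(feature_count, total_cases):
    """Express a raw feature count as a frequency per 100 cases."""
    if total_cases <= 0:
        raise ValueError("total_cases must be positive")
    return round(100 * feature_count / total_cases, 2)


# Illustrative counts only; these are not values from the data set.
counts = {"cough": 50, "pneumonia": 30}
total_cases = 200
normalized = {name: per_100_cases(n, total_cases) for name, n in counts.items()}
```

Normalizing this way lets frequencies be compared across variant waves with very different sample sizes.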
Statistical analysis. Odds ratios were used to determine the risk of death for each variant while accounting for vaccination status, and adjusted odds ratios were calculated while controlling for age and gender. In this study, age was represented as a continuous variable and the reference category for gender was men. The two-tailed Chi-square test was used to study the association between vaccination status and respiratory symptoms of COVID-19 across variants. Chi-square tests are used to study the relationship between categorical variables using a contingency matrix. Here, the tests compared the frequency of patients who did and did not report a particular symptom. A contingency matrix was created for each variant separately, each comparing the frequency of a particular symptom between fully and not fully vaccinated patients.
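To make the contingency-matrix comparison concrete, the sketch below computes a Pearson Chi-square statistic and an unadjusted odds ratio for a single 2 x 2 table using only the Python standard library. It is not the study's code: the function names are hypothetical, the counts in the usage note are invented, and no continuity correction is applied, which is an assumption because the text does not say whether one was used.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson Chi-square test (no continuity correction) for [[a, b], [c, d]].

    Rows: exposure groups (e.g. unvaccinated, fully vaccinated).
    Columns: symptom reported, symptom not reported.
    Returns the statistic and the p-value for 1 degree of freedom.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column total must be non-zero")
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # For 1 degree of freedom, the Chi-square survival function is
    # erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


def odds_ratio(a, b, c, d):
    """Unadjusted odds ratio for the same 2 x 2 layout."""
    if b == 0 or c == 0:
        raise ValueError("odds ratio is undefined when b or c is zero")
    return (a * d) / (b * c)
```

As a usage illustration with invented counts, `chi_square_2x2(30, 70, 10, 90)` yields a statistic of 12.5 with a p-value below 0.05, and `odds_ratio(30, 70, 10, 90)` is about 3.86, meaning the first group has higher odds of reporting the symptom.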
A p-value of < 0.05 was considered statistically significant. Analyses were conducted using Python (version 3.6) and the SciPy package (version 1.8.0). Because group sizes between the fully vaccinated and partially vaccinated were disproportionate, the groups were balanced using the scikit-learn package (version 1.1.3) by randomly sampling patients from the larger group to match the size of the smaller group. The subsampled data were then placed into a logistic regression model, also from scikit-learn, with age and gender as covariates to obtain adjusted odds ratios for mortality.
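The balancing step described above is random undersampling. The sketch below shows the idea with the standard library rather than scikit-learn; the function name and the fixed seed are assumptions made for illustration and reproducibility, not details from the study.

```python
import random


def balance_groups(group_a, group_b, seed=0):
    """Randomly undersample the larger group to the size of the smaller one.

    Both inputs are lists of patient records. A seeded generator keeps the
    draw reproducible; the seed value here is arbitrary.
    """
    rng = random.Random(seed)
    target = min(len(group_a), len(group_b))
    # Sampling `target` items without replacement leaves the smaller group
    # complete (in shuffled order) and trims the larger group.
    return rng.sample(group_a, target), rng.sample(group_b, target)


# Illustrative patient identifiers only.
unvaccinated = list(range(100))
vaccinated = list(range(100, 110))
balanced_unvax, balanced_vax = balance_groups(unvaccinated, vaccinated)
```

In a workflow like the one described, the balanced groups would then be passed to a logistic regression with age and gender as covariates, and exponentiating the fitted coefficient for vaccination status would give the adjusted odds ratio.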

Results
Demographics and comorbidities. Tables 1, 2, 3 and 4 provide demographic information for the 55,406 patients included based on variant and vaccine status (Founder (n = 2319), Alpha (n = 16,753), Delta (n = 7280), and Omicron (n = 29,054)). Across all variants, the fully vaccinated population was the oldest, followed by the partially vaccinated, with the unvaccinated population being the youngest. Additionally, each of the major variant waves included more females than males, and the sample was predominantly White, followed by Hispanic or Latino, and then Asian, African-American, Native Hawaiian or Pacific Islander, American Indian or Alaskan Native, and Unknown or other. Tables 5, 6, 7 and 8 provide patient comorbidity data delineated by variant. Across all variants, the fully vaccinated population had a higher frequency of chronic conditions such as anemia, atrial fibrillation, COPD, cancer, gastroesophageal reflux disease (GERD), heart failure, hypertension, immunocompromised status, kidney disease, obesity, and type 2 diabetes mellitus.

Table 3. Demographic information of patients with a positive SARS-CoV-2 result during the Delta variant wave (n = 7280 SARS-CoV-2 infections).
Columns: Fully vaccinated (n = 1542); Partially vaccinated (n = 493); Unvaccinated (n = 5245).
Mortality. Tables 9, 10 and 11 provide mortality data based on vaccination status for all patients in the study across the four major variants. Table 9, which compares mortality data between fully vaccinated and unvaccinated individuals, shows that there was not a statistically significant difference between the groups regarding mortality in the unadjusted analysis. However, in the adjusted analysis, the odds of death for unvaccinated individuals reached significance during the Delta and Omicron waves. During the Delta wave, the adjusted odds ratio of mortality for unvaccinated individuals was 1.21; during the Omicron wave, the adjusted odds ratio of mortality for unvaccinated individuals was 1.17. Additionally, there were substantially more patients in the unvaccinated group (n = 1344) than the fully vaccinated group (n = 158).
Table 10 shows mortality data across all variants based on partially vaccinated or unvaccinated status. In the unadjusted analysis, during the Alpha and Delta waves, unvaccinated individuals had a significantly higher likelihood of mortality compared with partially vaccinated individuals (Alpha OR: 1.98, p = 0.022; Delta OR: 3.26, p = 0.009). However, in the adjusted analysis, there were no statistically significant differences in mortality between partially vaccinated and unvaccinated individuals. Similarly, there were substantially more individuals in the unvaccinated group (n = 1381) compared with the partially vaccinated group (n = 51).
Table 11 shows the mortality data across all variants between individuals who were either fully vaccinated or partially vaccinated compared to unvaccinated individuals. In the unadjusted analysis, during the Delta wave, unvaccinated individuals had a higher risk of death compared to individuals who were either partially or fully vaccinated (Delta OR: 1.62, p = 0.012). In the adjusted analysis, unvaccinated individuals were significantly more likely to die compared with individuals who received any vaccination (Delta adjusted OR: 1.17, p = 0.024; Omicron adjusted OR: 1.2, p = 0.003). Again, there were more individuals in the unvaccinated group (n = 1344) than in the partially or fully vaccinated group (n = 209).
Effect of vaccination status. The effect of vaccine status on respiratory features for each variant wave was assessed in Tables 12, 13 and 14. Across all variants, a total of 11,556 individuals were fully vaccinated, 4207 were partially vaccinated, and 39,646 were unvaccinated. Tables 12 and 13 show that unvaccinated individuals have increased odds of having many upper and lower respiratory features. Table 14 likewise shows that unvaccinated individuals have increased odds of many upper and lower respiratory features compared with individuals who received any vaccination.

Discussion
Acute COVID-19 infection causes numerous respiratory disorders, and as such, it is necessary to investigate its impacts across the respiratory system as new variants have emerged. Because of the predilection of SARS-CoV-2 for causing impairments to the lungs and respiratory system, this study focused on assessing the direct consequences to the respiratory system over the course of the pandemic by examining the frequency of the most common features related to the most dominant and prevalent SARS-CoV-2 variants. Additionally, this study sought to assess the effect of vaccination status on respiratory features. A few observations warrant additional discussion. First, during the Delta and Omicron waves, there was a statistically significant difference in mortality between fully vaccinated and unvaccinated individuals. For partially vaccinated individuals, there was a significant reduction in mortality during the Alpha and Delta waves compared with individuals who were unvaccinated. Additionally, after combining individuals who were either fully vaccinated or partially vaccinated and comparing them with unvaccinated individuals, there were statistically significant reductions in mortality during the Delta and Omicron waves. The adjusted analyses showed that age and gender are confounding variables in the relationship between the independent variable (i.e. vaccination status) and the outcome variable (i.e. mortality). These findings also correspond to the findings reported in Tables 12 and 13, in which there were also significant differences in many prominent features such as pneumonia, acute respiratory failure, and hypoxemia between the fully vaccinated or partially vaccinated individuals and unvaccinated individuals.
Second, major sources of mortality in COVID-19 disease include acute respiratory distress syndrome (ARDS) and respiratory failure. We show that as variants evolve there is a reduction in lower respiratory tract features, such as pneumonia, hypoxemia, and acute respiratory failure. This finding may be due to increasing rates of vaccination, a reduction in virulence with successive variants, improvement in management of care for patients with acute COVID-19 infection, acquisition of immunity among those reinfected, or a combination of any of these factors. The drastic reduction in the frequency of pneumonia, a lower respiratory feature, across variants may best demonstrate this phenomenon: during the Founder phase, pneumonia was reported in 36.83 per 100 cases; however, during the Omicron phase, the frequency of pneumonia was reduced to 3.59 in fully vaccinated patients and 4.42 in unvaccinated individuals. Moreover, the statistically significant difference between the fully vaccinated, partially vaccinated, and unvaccinated patients in frequency of pneumonia supports the evidence regarding the immense benefits associated with vaccination in preventing severe disease. Although the reduction in lower respiratory features and the increased mortality observed during the Delta period may appear contradictory, the contradiction holds only if it is assumed that the individuals died from ARDS; unfortunately, the UC CORDS data set does not contain information regarding the specific cause of death, so it is difficult to determine whether [...] 3, which showed that the Omicron variant was associated with less likelihood of 3-day risk of emergency department visit, hospitalization, intensive care unit admission, and mortality, because of Omicron being less virulent in causing lung-related disease 3.
Table 6. Comorbidity information of patients with a positive SARS-CoV-2 result during the Alpha variant wave (n = 16,753 SARS-CoV-2 infections) expressed as frequency percentage.
In contrast to the decreasing lower respiratory symptoms observed in the later SARS-CoV-2 variants, there was a notable increase in the trend of upper respiratory symptoms in this study. In particular, the features of acute pharyngitis, acute upper respiratory infection, and cough all either increased or remained elevated during more recent stages of the pandemic. These findings suggest that infection with more recent SARS-CoV-2 variants produces more upper airway features than lower airway features. More research is necessary to conclude that this is the case, but these findings are congruent with studies involving animals 17,18.
Limitations. This study is not without limitations. First, the nature of the retrospective design is biased towards those who either sought care or required hospitalization for COVID-19; thus, these data will not include information regarding patients who did not seek care. Second, this study did not account for the timing of when patients received vaccination and when they became ill with COVID-19. The possibility remains that patients may have received the full doses of the vaccine, but developed COVID-19 before their bodies had time to develop sufficient antibodies. Moreover, UC CORDS does not account for severity of illness when a patient tested positive for COVID-19; it is plausible that a patient may have tested positive for COVID-19 but had few symptoms or may have had many. Third, although the UC CORDS data set contains the records of over 2500 patients infected with the Founder variant, at this stage of the pandemic, the virus was still a novel phenomenon and the UC CORDS database had not yet been fully set up; therefore, the records of some of the patients infected with the Founder variant may be lacking or missing. Fourth, symptom selection was determined through expert identification of symptoms and the corresponding ICD-10 codes obtained from the electronic health record, so there may be diagnoses which existed but were not captured in the UC CORDS database. Moreover, both "pneumonia" and "viral pneumonia" ranked in the top 40 of feature collection, but due to similarities in presentation, "viral pneumonia" was combined with pneumonia, thus raising the possibility that the actual frequency of pneumonia may have been slightly lower than what was captured in this study due to potential overlap. Fifth, genomic data of the viral strains are unavailable in the UC CORDS data set, and therefore it was necessary to rely on the information from the CDC's data tracker to determine time periods in [...]
Table 7. Comorbidity information of patients with a positive SARS-CoV-2 result during the Delta variant wave (n = 7280 SARS-CoV-2 infections) expressed as frequency percentage.
Future studies should examine the frequency data while controlling for the timing of vaccine administration. Furthermore, because vaccination guidelines changed throughout the pandemic, it is necessary to re-examine the data while using the latest vaccine guidelines (e.g. fully vaccinated in this study was considered to be at least two doses; however, some patients may have received an adenovirus vaccine, which initially only required one dose, while others may have received an mRNA vaccine, which initially required two doses). As guidelines continue to shift and the virus continues to mutate, future research should take into consideration the changing guidelines and definition of what "fully vaccinated" is considered to mean. Additionally, as ongoing research demonstrates the significant long-term effects of COVID-19 infection on developing post-acute sequelae of SARS-CoV-2 infection (PASC), it is imperative to assess how acute COVID-19 infection manifests for patients in the long term. Of particular concern is how the different variants may be associated with the development of PASC symptoms. Finally, as the COVID-19 pandemic continues to cause immense problems for patients and the healthcare system alike, it is essential to examine historical data to help inform present and future decision-making.

Conclusion
This retrospective study examined the frequency of respiratory features across four major variants since the outset of the COVID-19 pandemic. Additionally, patients were categorized based on vaccination status and mortality risk was assessed. This study found that there were significant reductions in the risks of mortality for patients who were vaccinated during the Delta period. Additionally, there were substantially fewer lower respiratory features associated with later variants, such as Omicron. Meanwhile, as the frequency of lower respiratory features decreased, there was a substantial uptick in the frequency of upper respiratory features. This study also showed substantial benefits in patients who were fully vaccinated compared with the unvaccinated or only partially vaccinated; the fully vaccinated population experienced significantly fewer features involving the upper and lower respiratory tract. This study indicates that because of numerous factors, including viral evolution, enhanced immunity, and likely improved treatment modalities, respiratory features involving the lower respiratory tract are reported with less frequency compared with earlier stages of the pandemic.